Prospect-theoretic Q-learning
Abstract
We consider a prospect-theoretic version of the classical Q-learning algorithm for discounted reward Markov decision processes, wherein the controller perceives a distorted and noisy future reward, modeled by a nonlinearity that accentuates gains and under-represents losses relative to a reference point. We analyze the asymptotic behavior of the scheme by analyzing its limiting differential equation and using the theory of monotone dynamical systems to infer its asymptotic behavior. Specifically, we show convergence to equilibria, and establish some qualitative facts about the equilibria themselves.
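The scheme described in the abstract can be illustrated with a minimal tabular sketch. This is one possible reading, not the paper's exact update rule: the abstract does not give the form of the nonlinearity, the noise model, or the step sizes. The piecewise-linear `distort` function, its slopes, the Gaussian noise, the uniform exploration policy, and the names `pt_q_learning`, `P`, and `R` are all assumptions made here for illustration.

```python
import random


def distort(x, ref=0.0, gain_slope=1.2, loss_slope=0.8):
    """Hypothetical distortion (an assumption, not taken from the paper):
    amplifies values above the reference point and damps values below it."""
    d = x - ref
    return ref + (gain_slope * d if d >= 0 else loss_slope * d)


def pt_q_learning(P, R, gamma=0.5, steps=20000, noise_std=0.1, seed=0):
    """Tabular Q-learning where the future reward is perceived through
    additive noise followed by the distortion above.

    P[s][a] is a list of transition probabilities over next states.
    R[s][a] is the immediate reward for taking action a in state s.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    Q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        a = rng.randrange(n_actions)  # uniform exploration (assumed)
        s_next = rng.choices(range(n_states), weights=P[s][a])[0]
        visits[s][a] += 1
        step = 1.0 / visits[s][a]  # decreasing step size per state-action pair
        # Assumed perception model: noisy future value passed through the distortion.
        perceived = distort(max(Q[s_next]) + rng.gauss(0.0, noise_std))
        Q[s][a] += step * (R[s][a] + gamma * perceived - Q[s][a])
        s = s_next
    return Q


# Toy two-state, two-action MDP (made-up numbers for illustration only).
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0],
     [0.0, 0.5]]
Q = pt_q_learning(P, R, steps=2000)
```

Setting both slopes to 1 and `noise_std` to 0 recovers the standard Q-learning update, which makes it easy to compare the distorted and undistorted iterates on the same toy problem.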
Related resources
Fragility of the Commons under Prospect-Theoretic Risk Attitudes
We study a common-pool resource game where the resource experiences failure with a probability that grows with the aggregate investment in the resource. To capture decision making under such uncertainty, we model each player’s risk preference according to the value function from prospect theory. We show the existence and uniqueness of a pure strategy Nash equilibrium when the players have arbit...
Non-Rational Discrete Choice Based On Q-Learning And The Prospect Theory
When modelling human discrete choice the standard approach is to adopt the rational model. This has been shown, however, to fail systematically under some conditions, which makes evident the need for a better approach. The choice model is however only part of the problem because it does not say how to deal with uncertainty, where learning is necessary. In this regard, some evidences support the...
Reactive Power Compensation Game under Prospect-Theoretic Framing Effects
Reactive power compensation is an important challenge in current and future smart power systems. However, in the context of reactive power compensation, most existing studies assume that customers can assess their compensation value, i.e., Var unit, objectively. In this paper, customers are assumed to make decisions that pertain to reactive power coordination. In consequence, the way in which t...
P14: Anxiety Control Using Q-Learning
Anxiety disorders are the most common reasons for referral to specialized clinics. If the response to stress is changed, anxiety can be greatly controlled. The most obvious effect of stress occurs on the circulatory system, especially through sweating. The electrical conductivity of skin, or in other words Galvanic Skin Response (GSR), which is dependent on stress level, is used; beside this parameter pe...
Evaluating project’s completion time with Q-learning
Nowadays project management is a key component in introductory operations management. The educators and the researchers in these areas advocate representing a project as a network and applying the solution approaches for network models to them to assist project managers to monitor their completion. In this paper, we evaluated project’s completion time utilizing the Q-learning algorithm. So the ...
Journal
Journal title: Systems & Control Letters
Year: 2021
ISSN: 1872-7956, 0167-6911
DOI: https://doi.org/10.1016/j.sysconle.2021.105009